Abstract



This study presents a novel technique to estimate the computational complexity of sequential 
decoding using the Berry-Esseen theorem. Unlike the theoretical bounds determined by the conventional 
central limit theorem argument, which often holds only for sufficiently large codeword length, the 
new bound obtained from the Berry-Esseen theorem is valid for any blocklength. The accuracy of the 
new bound is then examined for two sequential decoding algorithms, an ordering-free variant of the
generalized Dijkstra's algorithm (GDA) (or simplified GDA) and the maximum-likelihood sequential
decoding algorithm (MLSDA). Empirically investigating codes of small blocklength reveals that the
theoretical upper bound for the simplified GDA almost matches the simulation results when the
signal-to-noise ratio (SNR) per information bit ($\gamma_b$) is greater than or equal to 8 dB. However,
the theoretical bound may become markedly higher than the simulated average complexity when $\gamma_b$
is small. For the MLSDA, the theoretical upper bound is quite close to the simulation results for both
high SNR ($\gamma_b > 6$ dB) and low SNR ($\gamma_b < 2$ dB). Even for moderate SNR, the simulation
results and the theoretical bound differ by at most 0.8 on a $\log_{10}$ scale.



Coding, Decoding, Large Deviations, Convolutional Codes, Maximum-Likelihood, Soft-Decision, 
Sequential Decoding 



I. Introduction

The Berry-Esseen theorem [6, sec. XVI.5] states that the distribution of the sum of independent
zero-mean random variables $\{X_i\}_{i=1}^n$, normalized by the standard deviation of the sum, differs
from the unit Gaussian distribution by no more than $C\,r_n/s_n^3$, where $s_n^2$ and $r_n$ are,
respectively, the sums of the marginal variances and the marginal absolute third moments, and the
Berry-Esseen coefficient, $C$, is an absolute constant. Specifically, for every $a \in \Re$,

$$\left|\Pr\left\{\frac{1}{s_n}\left(X_1+\cdots+X_n\right)\le a\right\}-\Phi(a)\right| \le C\,\frac{r_n}{s_n^3}, \qquad (1)$$

where $\Phi(\cdot)$ represents the unit Gaussian cumulative distribution function (cdf). The remarkable
aspect of this theorem is that the upper bound depends only on the variance and the absolute
third moment, and therefore can provide a good probability estimate through the first three
moments. A typical estimate of the absolute constant is six [6, sec. XVI.5, Thm. 2]. When
$\{X_i\}_{i=1}^n$ are identically distributed, in addition to being independent, the absolute constant
can be reduced to three, and has been reported to be improved down to 2.05 [6, sec. XVI.5, Thm. 1]. In
1972, Beek sharpened the constant to 0.7975 [2]. Later, Shiganov further improved the constant
down to 0.7915 for an independent sample sum, and to 0.7655 if these samples are also identically
distributed [25]. Shiganov's result is generally considered to be the best result obtained thus
far [24].
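As a quick numerical illustration of inequality (1) in the i.i.d. case, where $s_n^2 = n\sigma^2$ and $r_n = n\rho$ so that the bound becomes $C\rho/(\sigma^3\sqrt{n})$, the following minimal sketch compares the bound with the exact deviation for centered Bernoulli samples. The choice of Bernoulli samples and all function names are assumptions made for illustration only; they are not taken from the paper.

```python
import math

C_IID = 0.7655  # constant quoted above for i.i.d. samples


def gaussian_cdf(x):
    """Unit Gaussian cdf Phi(x)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def berry_esseen_bound(sigma, rho, n, c=C_IID):
    """Bound c * rho / (sigma^3 * sqrt(n)) for n i.i.d. zero-mean samples."""
    return c * rho / (sigma ** 3 * math.sqrt(n))


def max_deviation_bernoulli(p, n):
    """Largest |cdf - Phi| for the normalized sum of n centered Bernoulli(p)
    samples, examined just below and at every jump point of the cdf."""
    sigma = math.sqrt(p * (1.0 - p))
    worst, cdf = 0.0, 0.0
    for k in range(n + 1):
        a = (k - n * p) / (sigma * math.sqrt(n))
        phi = gaussian_cdf(a)
        worst = max(worst, abs(cdf - phi))      # left limit at the jump
        cdf += math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
        worst = max(worst, abs(cdf - phi))      # value at the jump
    return worst


def bernoulli_moments(p):
    """Standard deviation and absolute third central moment of Bernoulli(p)."""
    sigma = math.sqrt(p * (1.0 - p))
    rho = p * (1.0 - p) * ((1.0 - p) ** 2 + p ** 2)
    return sigma, rho
```

For example, `berry_esseen_bound(*bernoulli_moments(0.5), 20)` can be compared against `max_deviation_bernoulli(0.5, 20)`; the bound holds for every blocklength, which is the property exploited in the rest of the paper.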

In applying this inequality to analyze the computational complexity of sequential decoding
algorithms, the original analytical problem is first transformed into one that concerns the
asymptotic probability mass of the sum of independent random samples. Inequality (1) can therefore
be applied. The complexities of two sequential maximum-likelihood decoding algorithms are
then analyzed. One is an ordering-free variant of the generalized Dijkstra's algorithm (GDA)
[14] operated over a code tree of linear block codes, and the other is the maximum-likelihood
sequential decoding algorithm (MLSDA) [13] that searches for the codeword over a trellis of
binary convolutional codes.

The computational effort required by sequential decoding is conventionally determined using a 
random coding technique, which averages the computational effort over the ensemble of random 
tree codes [16], [18], [23]. Branching process analysis on sequential decoding complexity has 
been recently proposed [10], [19], [20]; the results, however, were still derived by averaging over 
semi-random tree codes. Chevillat and Costello proposed to analyze the computational effort of
sequential decoding in terms of the column distance function of a specific time-invariant code
[4]; however, their analysis applies only to codes transmitted via binary symmetric channels.

In light of the Berry-Esseen inequality and the large deviations technique, this work presents
an alternative approach to derive the theoretical upper bounds on the computational effort
of the simplified GDA and the MLSDA for binary codes antipodally transmitted through an
additive white Gaussian noise (AWGN) channel. Unlike the bounds established in terms of
the conventional central limit theorem argument, which often holds only for sufficiently large
codeword length, the new bound is valid for any blocklength. Empirically investigating codes of
small blocklength shows that for the trellis-based MLSDA, the theoretical upper bound is quite
close to the simulation results for both high SNR and low SNR; even for moderate SNR, the
theoretical upper bound and the simulation results differ by no more than 0.579966 on a $\log_{10}$
scale. For the tree-based ordering-free GDA, the theoretical bound coincides with the simulation
results at high SNR; however, the bound tends to be substantially larger than the simulation
results at very low SNR. The possible cause of the inaccuracy of the bound at low SNR for the
tree-based ordering-free GDA is addressed at the end of this study.

The rest of this paper is organized as follows. Section II uses the Berry-Esseen inequality to
derive a probability bound for analyzing the sequential decoding complexity. Section III
presents an analysis of the average computational complexity of the GDA. Section IV briefly
introduces the MLSDA, and then analyzes its complexity upper bound. Conclusions are finally
drawn in Section V.

Throughout this article, $\Phi(\cdot)$ denotes the unit Gaussian cdf.



II. Berry-Esseen Theorem and probability bound

This section derives an upper probability bound for the sum of independent random samples
using the Berry-Esseen inequality. This bound is essential to the analysis of the computational
effort of sequential decoding algorithms.

The approach used here is the large deviations technique, which is generally applied to compute
the exponent of an exponentially decaying probability mass. The Berry-Esseen inequality is
also applied to evaluate the subexponential detail of the concerned probability. With these two
techniques, an upper bound of the concerned probability can be established.

Lemma 1: Let $Y_n = \sum_{i=1}^n X_i$ be the sum of i.i.d. random variables whose marginal
distribution is $F(\cdot)$. Define the twisted distribution with parameter $\theta$ corresponding to $F(\cdot)$ as:

$$dF^{(\theta)}(x) \triangleq \frac{\exp\{\theta x\}\,dF(x)}{M(\theta)},$$

where $M(\theta) \triangleq E[e^{\theta X_1}]$. Let the random variable with probability distribution $F^{(\theta)}(\cdot)$ be $X^{(\theta)}$.
Then, for every $\theta < 0$,

$$\Pr\{Y_n \le -n\alpha\} \le A_n(\theta,\alpha)\,e^{\theta\alpha n}M^n(\theta),$$

where $A_n(\theta,\alpha) \triangleq \min\{B_n(\theta,\alpha), 1\}$,

$$B_n(\theta,\alpha) \triangleq \begin{cases}
\dfrac{\sigma(\theta)}{\sqrt{2\pi n}\,[(\mu(\theta)+\alpha)-\theta\sigma^2(\theta)]}\,e^{-(\mu(\theta)+\alpha)^2 n/[2\sigma^2(\theta)]} + 2C\,\dfrac{\rho(\theta)}{\sigma^3(\theta)\sqrt{n}}, & \text{if } \alpha > \theta\sigma^2(\theta)-\mu(\theta);\\[2ex]
e^{\theta[\theta\sigma^2(\theta)-2(\mu(\theta)+\alpha)]n/2} + 2C\,\dfrac{\rho(\theta)}{\sigma^3(\theta)\sqrt{n}}, & \text{otherwise},
\end{cases}$$

$$\mu(\theta) \triangleq E[X^{(\theta)}], \quad \sigma^2(\theta) \triangleq E[|X^{(\theta)}-\mu(\theta)|^2], \quad \rho(\theta) \triangleq E[|X^{(\theta)}-\mu(\theta)|^3],$$

and $C = 0.7655$.



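Proof: Define $F_n^{(\theta)}(y) \triangleq \Pr[X_1^{(\theta)} + \cdots + X_n^{(\theta)} \le y]$, and let the distribution of
$[(X_1^{(\theta)} - \mu(\theta)) + \cdots + (X_n^{(\theta)} - \mu(\theta))]/[\sigma(\theta)\sqrt{n}]$ be $H_n(\cdot)$, where in the evaluation of the above
two statistics, $\{X_i^{(\theta)}\}_{i=1}^n$ are assumed independent with common marginal distribution $F^{(\theta)}(\cdot)$.
Then, by denoting $Y_n^{(\theta)} = X_1^{(\theta)} + X_2^{(\theta)} + \cdots + X_n^{(\theta)}$ and, for brevity, $b_n \triangleq -(\mu(\theta)+\alpha)\sqrt{n}/\sigma(\theta)$, we obtain:

$$\begin{aligned}
\Pr(Y_n \le -n\alpha) &= \int_{[x_1+\cdots+x_n \le -n\alpha]} dF(x_1)\,dF(x_2)\cdots dF(x_n)\\
&= M^n(\theta)\int_{[x_1+\cdots+x_n \le -n\alpha]} e^{-\theta(x_1+\cdots+x_n)}\,dF^{(\theta)}(x_1)\,dF^{(\theta)}(x_2)\cdots dF^{(\theta)}(x_n)\\
&= M^n(\theta)\,E\left[e^{-\theta Y_n^{(\theta)}}\,\mathbf{1}\{Y_n^{(\theta)} \le -n\alpha\}\right]\\
&= M^n(\theta)\int_{-\infty}^{-n\alpha} e^{-\theta y}\,dF_n^{(\theta)}(y)\\
&= e^{\theta\alpha n}M^n(\theta)\int_{-\infty}^{b_n} e^{-\theta\sigma(\theta)\sqrt{n}\,[y-b_n]}\,dH_n(y),
\end{aligned}$$

where $\mathbf{1}\{\cdot\}$ is the set indicator function, and the last equality follows from $H_n(y) = F_n^{(\theta)}(\sigma(\theta)\sqrt{n}\,y + \mu(\theta)n)$.
Integrating by parts on the last integral with $\lambda(dy) = -\theta\sigma(\theta)\sqrt{n}\exp\{-\theta\sigma(\theta)\sqrt{n}\,[y+(\mu(\theta)+\alpha)\sqrt{n}/\sigma(\theta)]\}\,dy$
defined over $(-\infty, -(\mu(\theta)+\alpha)\sqrt{n}/\sigma(\theta)]$, and then applying inequality (1) yields

$$\begin{aligned}
\int_{-\infty}^{b_n} e^{-\theta\sigma(\theta)\sqrt{n}\,[y-b_n]}\,dH_n(y) &= \int_{-\infty}^{b_n} [H_n(b_n) - H_n(y)]\,\lambda(dy)\\
&\le \int_{-\infty}^{b_n} [\Phi(b_n) - \Phi(y)]\,\lambda(dy) + 2C\,\frac{\rho(\theta)}{\sigma^3(\theta)\sqrt{n}}\\
&= \int_{-\infty}^{b_n} e^{-\theta\sigma(\theta)\sqrt{n}\,[y-b_n]}\,d\Phi(y) + 2C\,\frac{\rho(\theta)}{\sigma^3(\theta)\sqrt{n}}\\
&= e^{\theta[\theta\sigma^2(\theta)-2(\mu(\theta)+\alpha)]n/2}\,\Phi\left(-\frac{[(\mu(\theta)+\alpha)-\theta\sigma^2(\theta)]\sqrt{n}}{\sigma(\theta)}\right) + 2C\,\frac{\rho(\theta)}{\sigma^3(\theta)\sqrt{n}}\\
&\le B_n(\theta,\alpha),
\end{aligned}$$

where the third line holds by, again, applying integration by parts, and the last inequality follows from

$$\Phi(-u) \le \frac{1}{\sqrt{2\pi}\,u}\,e^{-u^2/2} \quad\text{and}\quad \Phi(u) \le 1 \quad\text{for } u > 0.$$

It remains to show that

$$\int_{-\infty}^{-(\mu(\theta)+\alpha)\sqrt{n}/\sigma(\theta)} e^{-\theta\sigma(\theta)\sqrt{n}\,[y+(\mu(\theta)+\alpha)\sqrt{n}/\sigma(\theta)]}\,dH_n(y) \le 1,$$

which can be established by observing that the integrand does not exceed one over the range of integration, and that

$$e^{\theta\alpha n}M^n(\theta)\int_{-\infty}^{-(\mu(\theta)+\alpha)\sqrt{n}/\sigma(\theta)} e^{-\theta\sigma(\theta)\sqrt{n}\,[y+(\mu(\theta)+\alpha)\sqrt{n}/\sigma(\theta)]}\,dH_n(y) = \Pr\{Y_n \le -n\alpha\}. \qquad (7)$$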
Some remarks are made following Lemma 1. First, the upper probability bound in
Lemma 1 consists of two parts, the exponentially decaying $e^{\theta\alpha n}M^n(\theta)$ and the subexponentially
bounded $A_n(\theta,\alpha)$. When $\alpha > \theta\sigma^2(\theta) - \mu(\theta)$ and $\alpha \ne -\mu(\theta)$,

$$B_n(\theta,\alpha) \approx 2C\,\frac{\rho(\theta)}{\sigma^3(\theta)\sqrt{n}},$$

since the first term decays exponentially fast, and $B_n(\theta,\alpha)$ reduces to the Berry-Esseen
probability bound. However, when $\theta$ is taken to satisfy $\mu(\theta) = -\alpha$,

$$B_n(\theta,\alpha) = \frac{1}{\sqrt{2\pi n}\,|\theta|\,\sigma(\theta)} + 2C\,\frac{\rho(\theta)}{\sigma^3(\theta)\sqrt{n}},$$

and a larger bound (than the Berry-Esseen one) results. In either case, $B_n(\theta,\alpha)$ vanishes
exactly at the speed of $1/\sqrt{n}$. Secondly, when $A_n(\theta,\alpha) = 1$, the upper probability bound reduces
to the simple Chernoff bound $e^{\theta\alpha n}M^n(\theta)$, for which a four-line proof is sufficient
[8, Eq. (5.4.9)], and which is always valid for every $\theta < 0$, regardless of whether $\alpha > \theta\sigma^2(\theta) - \mu(\theta)$
or not.
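The two regimes discussed in these remarks can be checked numerically. The sketch below evaluates the subexponential factor under the assumption that $B_n(\theta,\alpha)$ has the two-case form derived in the proof of Lemma 1 (a Gaussian-tail term or a pure exponential term, each plus $2C\rho(\theta)/(\sigma^3(\theta)\sqrt{n})$). The twisted moments $\mu(\theta)$, $\sigma(\theta)$ and $\rho(\theta)$ are passed in as plain numbers; the function and argument names are illustrative only.

```python
import math

C = 0.7655  # Berry-Esseen constant used in Lemma 1


def subexp_B(theta, alpha, n, mu, sigma, rho, c=C):
    """Two-case expression for B_n(theta, alpha), assuming the form obtained
    in the proof of Lemma 1. mu, sigma, rho are the mean, standard deviation
    and absolute third central moment of the twisted sample."""
    if theta >= 0:
        raise ValueError("Lemma 1 requires theta < 0")
    be_term = 2.0 * c * rho / (sigma ** 3 * math.sqrt(n))
    gap = (mu + alpha) - theta * sigma ** 2
    if gap > 0:
        # Gaussian-tail case: alpha > theta * sigma^2 - mu
        first = (sigma / (math.sqrt(2.0 * math.pi * n) * gap)
                 * math.exp(-((mu + alpha) ** 2) * n / (2.0 * sigma ** 2)))
    else:
        # Remaining case: only Phi(.) <= 1 is used
        first = math.exp(theta * (theta * sigma ** 2 - 2.0 * (mu + alpha)) * n / 2.0)
    return first + be_term


def subexp_A(theta, alpha, n, mu, sigma, rho, c=C):
    """A_n = min{B_n, 1}; A_n = 1 recovers the simple Chernoff bound."""
    return min(subexp_B(theta, alpha, n, mu, sigma, rho, c), 1.0)
```

Multiplying `subexp_B(...)` by `math.sqrt(n)` for growing `n` shows the $1/\sqrt{n}$ decay: the product tends to $2C\rho/\sigma^3$ when $\alpha \ne -\mu(\theta)$ and stays at $1/(\sqrt{2\pi}|\theta|\sigma) + 2C\rho/\sigma^3$ when $\mu(\theta) = -\alpha$.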

The independent samples $\{X_i\}_{i=1}^n$ with which our decoding problems are concerned actually
consist of two i.i.d. sequences, one of which is Gaussian distributed and the other is
non-Gaussian distributed. One way to bound the desired probability of $\Pr[\sum_{i=1}^n X_i \le 0]$ is to directly
use the Berry-Esseen inequality for independent but non-identical samples (which can be done
by following a proof similar to that of Lemma 1). However, in order to obtain a better bound, we will apply
Lemma 1 only to those non-Gaussian i.i.d. samples, and manipulate the remaining Gaussian
samples directly by way of their known probability densities in the lemma below.

Lemma 2: Let $Y_n = \sum_{i=1}^n X_i$ be the sum of independent random variables $\{X_i\}_{i=1}^n$, among
which $\{X_i\}_{i=1}^d$ are identically Gaussian distributed with positive mean $\mu$ and non-zero variance
$\sigma^2$, and $\{X_i\}_{i=d+1}^n$ have common marginal distribution as $\min\{X_1, 0\}$. Let $\gamma \triangleq (1/2)(\mu^2/\sigma^2)$.
Then

$$\Pr\{Y_n \le 0\} \le B(d, n-d, \gamma),$$

and $\lambda$ is the unique solution (in $[0, \sqrt{2\gamma})$) of

$$e^{\lambda^2/2}\,\Phi(-\lambda) = \frac{1}{\sqrt{2\pi}\,\lambda}\left(1-\frac{d}{n}\right) - \frac{d}{n}\,e^{\gamma}\,\Phi(\sqrt{2\gamma}).$$
Proof: Only the bound for $d < n$ is proved since the case of $d = n$ can be easily
substantiated.
Let 

E[\XZ - E[XZ] 



^VarL^] and p{e) - W "^ 1W 



(7 



a" 



and let /t = E[X d+1 ]/a. By noting that (/x/cr) = v / 2~7, and for any 9 < satisfying that 

a±-fi-<T0a 2 (0) + fi(9) > 0, 
$\Pr(Y_n \le 0)$ can be bounded by

Pr(K„ < 0) 

= PT{Xx + --- + X d + X d+1 + --- + X n <0} 

f°° 1 _( X -dnf 
where $M(\theta) = E[e^{\theta X_{d+1}}]$, and the last inequality follows from Lemma 1. Observe that

Thus,

where for $\alpha > 0$,



A n _ d (0) = min 



2C- 



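Now for $\theta < 0$ and $\alpha \le 0$, we can use the Chernoff bound instead, in which case the derivation
similarly follows with $A_{n-d}(\theta) = 1$.
We then note that

$$e^{d\theta\mu + d\theta^2\sigma^2/2}\,M^{n-d}(\theta)$$

is exactly the moment generating function of $Y_n = \sum_{i=1}^n X_i$; hence, if $E[Y_n] = d\mu + (n-d)\sigma\tilde{\mu} > 0$,
then the solution $\theta$ of $\partial E[e^{\theta Y_n}]/\partial\theta = 0$ is definitely negative.

For notational convenience, we let $\lambda \triangleq (\mu/\sigma) + \sigma\theta = \sqrt{2\gamma} + \sigma\theta$, and yield that

$$M(\theta) = \Phi(-\lambda)\,e^{-\gamma}e^{\lambda^2/2} + \Phi(\sqrt{2\gamma}) \quad\text{and}\quad e^{\theta\mu + \theta^2\sigma^2/2} = e^{-\gamma}e^{\lambda^2/2}.$$

Accordingly, the chosen $\lambda = \sqrt{2\gamma} + \sigma\theta$ should satisfy

$$e^{\lambda^2/2}\,\Phi(-\lambda) = \frac{1}{\sqrt{2\pi}\,\lambda}\left(1-\frac{d}{n}\right) - \frac{d}{n}\,e^{\gamma}\,\Phi(\sqrt{2\gamma}).$$

As it turns out, the solution $\lambda = \lambda(\gamma)$ of the above equation depends only on $\gamma$. Now, by
replacing $e^{(1/2)\lambda^2}\Phi(-\lambda)$ with $(1 - d/n)/(\sqrt{2\pi}\lambda) - (d/n)e^{\gamma}\Phi(\sqrt{2\gamma})$, we obtain

$$M(\theta) = \left(1-\frac{d}{n}\right)\left[\frac{e^{-\gamma}}{\sqrt{2\pi}\,\lambda} + \Phi(\sqrt{2\gamma})\right].$$

Hence, the previously obtained upper bound for $\Pr(Y_n \le 0)$ can be reformulated as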

(n - d)fi + dv/27 s 
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where 

i n _ d (A) = min I l{a > 0} 
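Finally, a simple derivation yields

$$\begin{aligned}
E[Y_n] &= d\,E[X_1] + (n-d)\,E[X_{d+1}]\\
&= \sigma\left(d\sqrt{2\gamma} + (n-d)\left[-\frac{1}{\sqrt{2\pi}}\,e^{-\gamma} + \sqrt{2\gamma}\,\Phi(-\sqrt{2\gamma})\right]\right),
\end{aligned}$$

and hence, the condition of $E[Y_n] > 0$ can be equivalently replaced by

$$\frac{d}{n} > \frac{1 - \sqrt{4\pi\gamma}\,e^{\gamma}\,\Phi(-\sqrt{2\gamma})}{1 + \sqrt{4\pi\gamma}\,e^{\gamma}\,\Phi(\sqrt{2\gamma})}.$$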
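To make the role of $\lambda$ concrete, the following minimal sketch solves the stationarity condition implied by the substitution quoted in the proof, namely $e^{\lambda^2/2}\Phi(-\lambda) = (1-d/n)/(\sqrt{2\pi}\lambda) - (d/n)e^{\gamma}\Phi(\sqrt{2\gamma})$, by bisection, and then evaluates only the Chernoff part of the bound, $e^{-d(\gamma-\lambda^2/2)}M^{n-d}(\theta)$ with $M(\theta) = \Phi(-\lambda)e^{-\gamma}e^{\lambda^2/2} + \Phi(\sqrt{2\gamma})$. This corresponds to setting the subexponential factor to one; it is not the full function $B(\cdot,\cdot,\cdot)$ of Lemma 2. The bisection solver, the Monte Carlo helper and all names are assumptions made for illustration.

```python
import math
import random


def gaussian_cdf(x):
    """Unit Gaussian cdf Phi(x)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def solve_lambda(d, n, gamma, iterations=200):
    """Bisection for lambda in (0, sqrt(2*gamma)); needs 0 < d < n, E[Y_n] > 0."""
    root2g = math.sqrt(2.0 * gamma)

    def g(lam):
        lhs = math.exp(lam * lam / 2.0) * gaussian_cdf(-lam)
        rhs = ((1.0 - d / n) / (math.sqrt(2.0 * math.pi) * lam)
               - (d / n) * math.exp(gamma) * gaussian_cdf(root2g))
        return lhs - rhs

    lo, hi = 1e-12, root2g
    if g(hi) <= 0.0:
        raise ValueError("E[Y_n] is not positive: no root in the interval")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def chernoff_part(d, n, gamma):
    """Chernoff-only bound on Pr{Y_n <= 0} (subexponential factor set to 1)."""
    lam = solve_lambda(d, n, gamma)
    m_theta = (gaussian_cdf(-lam) * math.exp(lam * lam / 2.0 - gamma)
               + gaussian_cdf(math.sqrt(2.0 * gamma)))
    return math.exp(d * (lam * lam / 2.0 - gamma)) * m_theta ** (n - d)


def simulate_tail(d, n, gamma, trials, seed=0):
    """Monte Carlo estimate of Pr{Y_n <= 0} with sigma normalized to 1."""
    rng = random.Random(seed)
    mu = math.sqrt(2.0 * gamma)
    hits = 0
    for _ in range(trials):
        total = sum(rng.gauss(mu, 1.0) for _ in range(d))
        total += sum(min(rng.gauss(mu, 1.0), 0.0) for _ in range(n - d))
        hits += total <= 0.0
    return hits / trials
```

Comparing `chernoff_part(5, 10, 0.5)` with `simulate_tail(5, 10, 0.5, 20000)` gives a feel for how loose the purely exponential bound is at short lengths, which is the gap the Berry-Esseen refinement targets.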



Again, if the simple Chernoff inequality is used instead in the derivation, the bound
remains of the same form as in Lemma 2 except that $A_{n-d}(\lambda)$ is always equal to one.

Empirical evaluations of $A_{n-d}(\lambda)$ in Figs. 1 and 2 indicate that when the sample number
$n < 50$, $A_{n-d}(\lambda)$ will be close to 1, and the subexponential analysis based on the Berry-Esseen
inequality does not help improve the upper probability bound. However, for a slightly larger
$n$ such as $n = 200$, a visible reduction in the probability bound can be obtained through the
introduction of the Berry-Esseen inequality.

One of the main subjects studied in this paper is whether the introduction of
the subexponential analysis can help improve the complexity bound at practical code lengths.
The observation from Figs. 1 and 2 does coincide with what we obtain in later applications.
That is, some visible improvement in the complexity bound can be obtained for a slightly larger
codeword length in the MLSDA (specifically, $N = 2(60 + 6)$ or $2(100 + 6)$). However, since the
simulated codes are only of lengths 24 and 48, no improvement can be observed for the GDA
algorithm.

Fig. 1. $A_{n-d}(\lambda)$ versus $n$ (from 50 to 400) for fixed $d/n = 0.2$ with respect to different $\gamma$ (1 dB, -1 dB, -3 dB and -5 dB). Notation "1(0)" represents that the y-tic is either 1 (for the
curve below) or 0 (for the curve above).

We end this section by presenting the operational meanings of the three arguments in function
$B(\cdot,\cdot,\cdot)$ before their use in subsequent sections. When in use for sequential-type decoding
complexity analysis, the first integer argument is the Hamming distance between the transmitted
codeword and the examined codeword up to the level of the currently visited tree node. The
second integer argument represents a prediction of the future route, which has not yet occurred,
and hence in our complexity analysis is always equal to the maximum length of the codewords
(resp. $n$ for the GDA algorithm and $N$ for the MLSDA algorithm) minus the length of the codeword
portion of the currently visited node (resp. $\ell$ for the GDA algorithm and $\ell n$ for the MLSDA algorithm).¹
The third argument is exactly the signal-to-noise ratio for the decoding environment, and is
reasonably assumed to be always positive.

Fig. 2. $A_{n-d}(\lambda)$ versus $n$ (from 50 to 400) for fixed $\gamma = -3$ dB with respect to different $d/n$ ratios (0.4, 0.3, 0.2 and 0.1). Notation "1(0)" represents that the y-tic is either
1 (for the curve below) or 0 (for the curve above).

III. Analysis of the Computational Effort of the Simplified Generalized Dijkstra's Algorithm

In 1993, a novel and fast maximum-likelihood soft-decision decoding algorithm for linear
block codes was proposed in [14], and was called the generalized Dijkstra's algorithm (GDA).
Computer simulations have shown that the algorithm is highly efficient (that is, it requires a small
average computational effort) for a certain number of linear block codes [5], [14]. Improvements
of the GDA have been subsequently reported [1], [5], [9], [11], [15], [21], [26].

1 The metric for use of sequential-type decoding can be generally divided into two parts, where the first part is determined by the
past branches traversed thus far, while the second part helps predict the future route to speed up the code search process [12].
For example, by adding a constant term $\sum_{j=1}^{L}\log_2\Pr(y_j)$ to the accumulated Fano metric $\sum_{j=1}^{q}\left(\log_2[\Pr(y_j|x_j)/\Pr(y_j)] - R\right)$
up to level $q$, it can be seen that $\sum_{j=1}^{q}\left(\log_2\Pr(y_j|x_j) - R\right)$ weights the history, and $\sum_{j=q+1}^{L}\log_2\Pr(y_j)$ is the expectation
of branch metrics to be added for possible future routes. Based on this intuition, the first argument and the second argument
respectively realize the historical known part and the future predictive part of the decoding metric.
The authors of [15] proposed an upper bound on the average computational effort of an
ordering-free variant of the GDA for linear block codes antipodally transmitted via the AWGN
channel; however, the bound is valid only for codes with sufficiently large codeword length.
Using the large deviations technique and the Berry-Esseen inequality, an alternative upper bound
that holds for any (thus including small) codeword length can be given.

A. Notations and definitions 

Let $\mathcal{C}$ be an $(n, k)$ binary linear block code with codeword length $n$ and dimension $k$, and let
$R \triangleq k/n$ be the code rate of $\mathcal{C}$. Denote the codeword of $\mathcal{C}$ by $x \triangleq (x_0, x_1, \ldots, x_{n-1})$. Also,
denote by $r \triangleq (r_0, r_1, \ldots, r_{n-1})$ the received vector when a codeword of $\mathcal{C}$ is transmitted via
a time-discrete memoryless channel.

From [3] (also [27], [28]), the maximum-likelihood (ML) estimate $\hat{x} = (\hat{x}_0, \hat{x}_1, \ldots, \hat{x}_{n-1})$ for
a time-discrete memoryless channel, upon the receipt of $r$, satisfies

$$\sum_{j=0}^{n-1}\left(\phi_j - (-1)^{\hat{x}_j}\right)^2 \le \sum_{j=0}^{n-1}\left(\phi_j - (-1)^{x_j}\right)^2 \quad\text{for all } x \in \mathcal{C}, \qquad (12)$$

where $\phi_j \triangleq \ln[\Pr(r_j|0)/\Pr(r_j|1)]$. An immediate implication of equation (12) is that using the
log-likelihood ratio vector $\phi \triangleq (\phi_0, \phi_1, \ldots, \phi_{n-1})$ rather than the received vector $r$ is sufficient
in ML decoding.

When the linear block code is antipodally transmitted through the AWGN channel, the
relationship between the binary codeword $x$ and the received vector $r$ can be characterized by

$$r_j = (-1)^{x_j}\sqrt{E} + e_j \quad\text{for } 0 \le j \le n-1, \qquad (13)$$

where $E$ is the signal energy per channel bit, and $e_j$ represents a noise sample of a Gaussian
process with single-sided noise power per hertz $N_0$. The signal-to-noise ratio for the channel is
therefore $\gamma \triangleq E/N_0$. In order to account for the code redundancy for different code rates, the
SNR per information bit $\gamma_b = \gamma/R$ is used instead of $\gamma$ in the following discussions.
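The channel model can be simulated in a few lines. The sketch below transmits a binary word antipodally over an AWGN channel with noise variance $N_0/2$ and converts the received samples into log-likelihood ratios using $\phi_j = 4\sqrt{E}\,r_j/N_0$, the expression used later in the complexity proof. The function names and the use of Python's standard `random` module are illustrative assumptions.

```python
import math
import random


def transmit_awgn(codeword, energy, n0, rng):
    """Antipodal signalling: bit 0 -> +sqrt(E), bit 1 -> -sqrt(E), plus
    Gaussian noise of variance N0/2 per sample."""
    noise_std = math.sqrt(n0 / 2.0)
    return [(-1) ** bit * math.sqrt(energy) + rng.gauss(0.0, noise_std)
            for bit in codeword]


def log_likelihood_ratios(received, energy, n0):
    """phi_j = ln[Pr(r_j|0)/Pr(r_j|1)] = 4 * sqrt(E) * r_j / N0 for this channel."""
    return [4.0 * math.sqrt(energy) * r / n0 for r in received]


def snr_per_info_bit_db(energy, n0, k, n):
    """gamma_b = gamma / R with gamma = E/N0 and R = k/n, expressed in dB."""
    return 10.0 * math.log10((energy / n0) / (k / n))
```

For instance, `log_likelihood_ratios(transmit_awgn([0] * 24, 1.0, 0.5, random.Random(0)), 1.0, 0.5)` produces the vector $\phi$ for an all-zero (24, 12) codeword; at high SNR almost all entries are positive.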

A code tree of an $(n, k)$ binary linear block code is formed by representing every codeword
as a code path on a binary tree of $(n + 1)$ levels. A code path is a particular path that begins at
the start node at level 0, and ends at one of the leaf nodes at level $n$. There are two branches,
respectively labelled by 0 and 1, that leave each node at the first $k$ levels. The remaining nodes
at levels $k$ through $(n - 1)$ consist of only a single leaving branch. The $2^k$ rightmost nodes
at level $n$ are referred to as goal nodes. In notation, $x_{[\ell]}$ is used to denote a path labelled
by $(x_0, x_1, \ldots, x_{\ell-1})$. For notational convenience, the subscript "$[n]$" is dropped for the label
sequence of a code path, namely $x_{[n]}$ is abbreviated to $x$. The same notational convention is adopted
for other notation including the received vector $r$ and the log-likelihood ratio vector $\phi$.

B. Brief description of the GDA 

For completeness, we briefly describe the GDA decoding algorithm of [14] in this subsection.

After obtaining the log-likelihood ratio vector $\phi = (\phi_0, \phi_1, \ldots, \phi_{n-1})$, the GDA algorithm
first permutes the positions of codeword components such that the codeword component that
corresponds to a larger absolute value of log-likelihood ratio appears earlier in its position whenever
possible, and still the first $k$ positions uniquely determine a code path. The post-permutation
codewords thereby result in a new code tree $\mathcal{C}^*$. Let $\phi^* = (\phi_0^*, \phi_1^*, \ldots, \phi_{n-1}^*)$ be the new
log-likelihood ratio vector after permutation, and define the path metric of a path $x_{[\ell]}$ (over the
new code tree $\mathcal{C}^*$) as $\sum_{j=0}^{\ell-1}(\phi_j^* - (-1)^{x_j})^2$. The path metric of a code path $x$ is thus given by
$\sum_{j=0}^{n-1}(\phi_j^* - (-1)^{x_j})^2$. The algorithm then searches for the code path with the minimum path
metric over $\mathcal{C}^*$, which, from equation (12), is exactly the code path labelled by the permuted
ML codeword. As expected, the final step of the algorithm is to output the de-permuted version
of the labels of the minimum-metric code path.

The search process of the GDA algorithm is guided by an evaluation function $f(\cdot)$, defined for
all paths of a code tree. A simple evaluation function [11] that guarantees the ultimate finding
of the minimum-metric code path is

$$f(x_{[\ell]}) = \sum_{j=0}^{\ell-1}\left(\phi_j^* - (-1)^{x_j}\right)^2 + \sum_{j=\ell}^{n-1}\left(|\phi_j^*| - 1\right)^2. \qquad (14)$$

Hence, when a path $x_{[\ell]}$ is extended to its immediate successor path $x_{[\ell+1]}$, the evaluation function
value is updated by adding the branch metric, $(\phi_\ell^* - (-1)^{x_\ell})^2 - (|\phi_\ell^*| - 1)^2$, to its original value.

The algorithm begins the search from the path that contains only the start node. It then extends,
among the paths that have been visited, the path with the smallest $f$-function value. Once the
algorithm chooses to extend a path that ends at a goal node, the search process terminates.
Notably, any path that ends at level $k$ has already uniquely determined a code path. Hence,
once a length-$k$ path is visited and the $f$-function value associated with its respective code path
does not exceed the associated $f$-function value of any of the later top paths in the stack, the
algorithm can ensure that this code path is the targeted one with the minimum code path metric.
This indicates that the computational complexity of the GDA is dominantly contributed by those
paths up to level $k$. This justifies our later analysis of the decoding complexity of the GDA,
where only the computations due to those paths up to level $k$ are considered.

The simplified GDA algorithm is an unpermuted variant of the GDA algorithm. In other
words, its codeword search is operated over the unpermuted original code tree $\mathcal{C}$. Although both
algorithms yield the same output, the simplified one was demonstrated to involve a larger branch
metric computational load [15]. We quote the algorithm below.

Step 1. Put the path that contains only the start node of the code tree into the Stack, and assign 
its evaluation function value as zero. 

Step 2. Compute the evaluation function value (as in (14)) for each of the successor paths of
the top path $x_{[\ell]}$ in the Stack by adding the branch metric of the extended branch to the
evaluation function value of the top path. Delete the top path from the Stack.

Step 3. Insert the successor paths into the Stack in order of ascending evaluation function value. 

Step 4. If the top path in the Stack ends at a goal node, output the codeword corresponding to 
the top path, and the algorithm stops; otherwise go to Step 2. 
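The four steps above can be sketched with a priority queue playing the role of the Stack. The example below is a minimal illustration, not the authors' implementation: it assumes a systematic (7, 4) Hamming code as a toy code (so that the first $k$ labels of a path are the information bits), uses the branch metric $(\phi_\ell - (-1)^{x_\ell})^2 - (|\phi_\ell| - 1)^2$ described earlier, and counts every branch metric computation, including those beyond level $k$.

```python
import heapq
from itertools import product

# Assumed toy code: systematic generator matrix of a (7, 4) Hamming code.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def encode(info, gen):
    """Codeword of a systematic linear code: info bits followed by parity bits."""
    n = len(gen[0])
    return tuple(sum(u * row[j] for u, row in zip(info, gen)) % 2 for j in range(n))


def branch_metric(phi_j, bit):
    """Increase of the evaluation function when a path is extended by `bit`."""
    return (phi_j - (-1) ** bit) ** 2 - (abs(phi_j) - 1.0) ** 2


def path_metric(phi, codeword):
    """Squared-distance metric minimized by ML decoding, as in (12)."""
    return sum((p - (-1) ** x) ** 2 for p, x in zip(phi, codeword))


def simplified_gda(phi, gen):
    """Stack search over the unpermuted code tree; returns (codeword, count)."""
    k, n = len(gen), len(gen[0])
    stack = [(0.0, ())]              # Step 1: start node with value zero
    computations = 0
    while True:
        f_value, path = heapq.heappop(stack)   # top path of the Stack
        level = len(path)
        if level == n:                          # Step 4: goal node reached
            return path, computations
        if level < k:
            next_bits = (0, 1)                  # two branches at the first k levels
        else:
            next_bits = (encode(path[:k], gen)[level],)  # single forced branch
        for bit in next_bits:                   # Steps 2 and 3
            computations += 1
            heapq.heappush(stack, (f_value + branch_metric(phi[level], bit),
                                   path + (bit,)))


def brute_force_ml(phi, gen):
    """Exhaustive ML decoding, used only to cross-check the search."""
    k = len(gen)
    return min((encode(u, gen) for u in product((0, 1), repeat=k)),
               key=lambda c: path_metric(phi, c))
```

Because the branch metric is either 0 or $4|\phi_\ell|$, the evaluation function never decreases along a path, which is why the first goal node reaching the top of the Stack carries the minimum path metric.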

It can be seen from the above algorithm that the simplified GDA algorithm resembles the
stack algorithm except that it uses the evaluation function in (14) instead of the Fano metric
to guide the search on the code tree, and is designed to decode block codes rather than
convolutional codes. In addition, the simplified GDA algorithm is maximum-likelihood in
performance, in contrast to the sub-optimality of the stack algorithm.

C. Analysis of the computational effort of the simplified GDA 

The computational effort of the simplified GDA can now be analyzed. 

Theorem 1 (Complexity of the simplified GDA): Consider an $(n, k)$ binary linear block code
antipodally transmitted via an AWGN channel. The average number of branch metric
computations evaluated by the simplified GDA, denoted by $L_{\rm SGDA}(\gamma_b)$, is upper-bounded by

$$L_{\rm SGDA}(\gamma_b) \le 2\sum_{\ell=0}^{k-1}\sum_{d=0}^{\ell}\binom{\ell}{d}\,B(d, n-\ell, R\gamma_b), \qquad (15)$$

where function $B(\cdot,\cdot,\cdot)$ is defined in Lemma 2.

Proof: Assume without loss of generality that the all-zero codeword is transmitted.
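Let $x^*$ label the minimum-metric code path for a given log-likelihood ratio vector $\phi$. Then
we quote from [15] that for any path $x_{[\ell]}$ selected for extension by the simplified GDA,

$$f(x_{[\ell]}\,|\,\phi) \le \sum_{j=0}^{n-1}\left(\phi_j - (-1)^{x_j^*}\right)^2,$$

which implies that for $\ell < k$,

$$\begin{aligned}
&\Pr\left[\text{path } x_{[\ell]} \text{ is extended by the simplified GDA}\right]\\
&\quad\le \Pr\left[f(x_{[\ell]}\,|\,\phi) \le \sum_{j=0}^{n-1}\left(\phi_j - (-1)^{x_j^*}\right)^2\right] &(16)\\
&\quad\le \Pr\left[f(x_{[\ell]}\,|\,\phi) \le \sum_{j=0}^{n-1}\left(\phi_j - (-1)^{0}\right)^2\right] &(17)\\
&\quad= \Pr\left[\sum_{j=0}^{\ell-1}\left(\phi_j - (-1)^{x_j}\right)^2 + \sum_{j=\ell}^{n-1}\left(|\phi_j| - 1\right)^2 \le \sum_{j=0}^{n-1}\left(\phi_j - 1\right)^2\right], &(18)
\end{aligned}$$

where (17) follows from the assumption that the path metric of the $x^*$-labelled code path is the
smallest with respect to $\phi$, and hence, does not exceed that of the 0-labelled code path.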

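Now denote by $J = J(x_{[\ell]})$ the set of index $j$, where $0 \le j \le \ell - 1$, for which $x_j = 1$. Then
(18) can be rewritten as

$$\begin{aligned}
\Pr\left[\text{path } x_{[\ell]} \text{ is extended by the simplified GDA}\right]
&\le \Pr\left[\sum_{j\in J}\phi_j + \sum_{j=\ell}^{n-1}\min(\phi_j, 0) \le 0\right]\\
&= \Pr\left[\sum_{j\in J} r_j + \sum_{j=\ell}^{n-1}\min(r_j, 0) \le 0\right], \qquad (19)
\end{aligned}$$

where (19) holds since for the AWGN channel specified in (13), $\phi_j = 4\sqrt{E}\,r_j/N_0$. As the all-zero
codeword is assumed to be transmitted, $r_j$ is Gaussian distributed with mean $\sqrt{E}$ and variance
$N_0/2$. Hence, Lemma 2 can be applied to obtain

$$\Pr\left[\text{path } x_{[\ell]} \text{ is extended by the simplified GDA}\right] \le B(d, n-\ell, R\gamma_b),$$

where $d = |J|$ is the Hamming weight of $x_{[\ell]}$.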

Observe that the extension of each path that ends at level $\ell$, where $\ell < k$, causes two branch
metric computations. Therefore, the expected number of branch metric evaluations
satisfies

$$L_{\rm SGDA}(\gamma_b) \le 2\sum_{\ell=0}^{k-1}\sum_{d=0}^{\ell}\binom{\ell}{d}\,B(d, n-\ell, R\gamma_b).$$
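The counting argument behind the bound can be sketched as follows, assuming the double-sum form $2\sum_{\ell=0}^{k-1}\sum_{d=0}^{\ell}\binom{\ell}{d}B(d, n-\ell, R\gamma_b)$: there are $\binom{\ell}{d}$ paths of Hamming weight $d$ at level $\ell \le k$, each extended with probability at most $B(\cdot,\cdot,\cdot)$, and each extension costs two branch metric computations. The probability bound is passed in as a callable because its closed form is not reproduced here; the function names are illustrative.

```python
from math import comb


def sgda_complexity_bound(n, k, gamma_b, prob_bound):
    """Evaluate 2 * sum_{l<k} sum_{d<=l} C(l, d) * prob_bound(d, n - l, R * gamma_b).

    prob_bound(d, remaining, gamma) is a user-supplied stand-in for B(., ., .);
    gamma_b is the SNR per information bit on a linear (not dB) scale."""
    rate = k / n
    total = 0.0
    for level in range(k):
        for weight in range(level + 1):
            total += comb(level, weight) * prob_bound(weight, n - level, rate * gamma_b)
    return 2.0 * total


def trivial_bound(d, remaining, gamma):
    """Worst case: every path is assumed to be extended with probability one."""
    return 1.0
```

With `trivial_bound`, the sum counts every path up to level $k-1$ and returns $2(2^k - 1)$, i.e., the cost of exhaustively expanding the first $k$ levels; any sharper probability bound lowers the estimate from there.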



D. Numerical and simulation results 

The accuracy of the previously derived theoretical upper bound for the average computational 
effort of the simplified GDA is now empirically studied. Two linear block codes are considered 
— one is a (24, 12) binary extended Golay code, and the other is a (48, 24) binary extended 
quadratic residue code. 

Figures 3 and 4 illustrate the deviation between the simulated results and the theoretical
upper bound in Theorem 1. Only one theoretical curve (rather than one enhanced by
Berry-Esseen analysis and the other with simple Chernoff-based analysis) is plotted in the two figures
because no improvement in function $B(\cdot,\cdot,\cdot)$ can be obtained by the introduction of the
Berry-Esseen analysis. According to these figures, the theoretical upper bound is quite close to the
simulation results for high $\gamma_b$ (above 8 dB). In such a case, the computational complexity of the
simplified GDA reduces to its minimum possible values, 24 and 48, for the (24, 12) and (48, 24)
codes, respectively. As $\gamma_b$ reaches 1 dB, the theoretical bound for the (48, 24) code is around 12 times
higher than the simulated average complexity. However, for the (24, 12) code, the theoretical
bound and the simulation results differ only by 0.671638 on a $\log_{10}$ scale at $\gamma_b = 1$ dB, and it is
when $\gamma_b < -8$ dB that the upper bound becomes ten times larger than the simulated complexity.
The conclusion section will address the possible cause of the inaccuracy of the theoretical bounds
at low SNR.

IV. Analysis of the Computational Effort of the MLSDA 

Based on the probability bound established in Lemma 2, the computational complexity of the
maximum-likelihood sequential decoding algorithm (MLSDA) proposed in [13] is analyzed for
convolutional codes antipodally transmitted via the AWGN channel.

A. Notation and definitions 

Let $\mathcal{C}$ be an $(n, k, m)$ binary convolutional code, where $k$ is the number of encoder inputs, $n$
is the number of encoder outputs, and $m$ is its memory order defined as the maximum number of
shift register stages from an encoder input to an encoder output. Let $R \triangleq k/n$ and $N \triangleq n(L + m)$
be the code rate and the code length of $\mathcal{C}$, respectively, where $L$ represents the length of
the applied information sequence. Denote the codeword of $\mathcal{C}$ by $x \triangleq (x_0, x_1, \ldots, x_{N-1})$. Also denote
the left portion of codeword $x$ by $x_{(b)} \triangleq (x_0, x_1, \ldots, x_b)$. Assume that antipodal signaling is
used in the codeword transmission such that the relationship between binary channel codeword
$x$ and received vector $r \triangleq (r_0, r_1, \ldots, r_{N-1})$ is

$$r_j = (-1)^{x_j}\sqrt{E} + e_j, \quad 0 \le j \le N-1, \qquad (20)$$

where $E$ is the signal energy per channel bit, and $e_j$ is a noise sample of a Gaussian process
with single-sided noise power per hertz $N_0$. The signal-to-noise ratio per information bit $\gamma_b =
(EN)/(N_0 kL)$ is again used to account for the code redundancy for various code rates.

Fig. 3. Average computational complexity of the simplified GDA for (24, 12) binary extended Golay code.

A trellis, as depicted in Fig. 5 in terms of a specific example, can be obtained from a code
tree by combining nodes with the same state. States are characterized by the content of the
shift-register stages in a convolutional encoder. For convenience, the leftmost node (at level 0) 
and the rightmost node (at level L + m) of a trellis are named the start node and the goal node, 
respectively. A path on a trellis from the single start node to the single goal node is called a 
code path. Each branch in the trellis is labelled by an appropriate encoder output of length n. 

Fig. 4. Average computational complexity of the simplified GDA for (48, 24) binary extended quadratic residue code (curves: theoretical bound on the GDA and simulation using the GDA, plotted against $\gamma_b$ on a logarithmic complexity axis).



B. Maximum-likelihood soft-decision sequential decoding algorithm (MLSDA) 

In [13], a trellis-based sequential decoding algorithm specifically for binary convolutional 
codes is proposed. The same paper proves that the algorithm performs maximum-likelihood 
decoding, and is thus named the maximum-likelihood sequential decoding algorithm (MLSDA). 
Unlike the conventional sequential decoding algorithm [7], [17], [22], [29] which requires only 
a single stack, the trellis-based MLSDA maintains two stacks — an Open Stack and a Closed 
Stack. For completeness, the algorithm is quoted below. 

Step 1. Put the path that contains only the start node into the Open Stack, and assign its path 
metric as zero. 

Step 2. Compute the path metric for each of the successor paths of the top path in the Open 
Stack by adding the branch metric of the extended branch to the path metric of the top 
path. Put into the Closed Stack both the state and level of the end node of the top path in 
the Open Stack. Delete the top path from the Open Stack. 

Step 3. Discard any successor path that ends at a node that has the same state and level as any
entry in the Closed Stack. If any successor path merges² with a path already in the Open
Stack, eliminate the path with higher path metric.

Step 4. Insert the remaining successor paths into the Open Stack in order of ascending path
metrics.

Step 5. If the top path in the Open Stack ends at the single goal node, the algorithm outputs
the codeword corresponding to the top path and stops; otherwise go to Step 2.

2 "Merging" of two paths means that the two paths end at the same node.

Fig. 5. Trellis for a (3, 1, 2) binary convolutional code with information length $L = 5$. In this case,
the code rate $R = 1/3$ and the codeword length $N = 3(5 + 2) = 21$. The code path indicated
by the thick line is labelled by 111, 010, 001, 110, 100, 101 and 011, thus its corresponding
codeword is $x$ = (111010001110100101011).

We remark after the presentation of the MLSDA that the Open Stack contains all paths that have 
been visited thus far, but excludes all prefixes of the paths in it. Hence, the Open Stack functions 
in a similar way to the stack in the conventional sequential decoding algorithm. The Closed Stack 
keeps the ending states and ending levels of those paths that had been the top 
paths of the Open Stack at some previous time. In addition, the path metric for a path labelled 
by $x_{(\ell n-1)} = (x_0, x_1, \ldots, x_{\ell n-1})$, upon receipt of $\phi_{(\ell n-1)}$, is given by 

$$\zeta\left(x_{(\ell n-1)} \,\big|\, \phi_{(\ell n-1)}\right) = \sum_{j=0}^{\ell n-1} (y_j \oplus x_j)\,|\phi_j|, \qquad (21)$$

where $\phi_j = \log[\Pr(r_j|0)/\Pr(r_j|1)]$ is the jth received log-likelihood ratio, $r_j$ is the jth received 
scalar, and $y_j = 1$ if $\phi_j < 0$ and $y_j = 0$ otherwise. 
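The five steps and the metric in (21) can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation used for the simulations in this paper. The generator taps (1+D, 1+D^2, 1+D+D^2) are an assumption: they are not stated in the text, and were chosen because they reproduce the codeword quoted in the caption of Fig. 5. The names `branch`, `encode`, `metric` and `mlsda` are hypothetical. Instead of physically deleting a merged path from the Open Stack as in Step 3, the sketch leaves it in the heap and skips it when it surfaces (lazy deletion), which leaves the set of extended nodes unchanged.

```python
import heapq

# Assumed generator taps (1+D, 1+D^2, 1+D+D^2) for a (3, 1, 2) code.
GENERATORS = ((1, 1, 0), (1, 0, 1), (1, 1, 1))
MEMORY = 2


def branch(state, bit):
    """Follow one trellis branch; state holds the last MEMORY inputs, newest first."""
    window = (bit,) + state
    out = tuple(sum(g * w for g, w in zip(gen, window)) % 2 for gen in GENERATORS)
    return window[:MEMORY], out


def encode(info_bits):
    """Encode info_bits followed by MEMORY zero tail bits."""
    state, codeword = (0,) * MEMORY, []
    for bit in list(info_bits) + [0] * MEMORY:
        state, out = branch(state, bit)
        codeword.extend(out)
    return codeword


def metric(bits, llrs):
    """Sum of (y_j XOR x_j)|phi_j|, with y_j = 1 exactly when phi_j < 0."""
    return sum(abs(p) for x, p in zip(bits, llrs) if x != (1 if p < 0 else 0))


def mlsda(llrs, info_len):
    """Return (decoded codeword, number of branch metric computations)."""
    n = len(GENERATORS)
    goal_level = info_len + MEMORY
    start = (0,) * MEMORY
    open_stack = [(0.0, 0, 0, start, ())]  # (metric, tie-breaker, level, state, path)
    best_open = {(0, start): 0.0}          # best metric per node in the Open Stack
    closed = set()                         # Closed Stack: (level, state) already extended
    pushed = 1
    computations = 0
    while open_stack:
        top_metric, _, level, state, path = heapq.heappop(open_stack)
        if best_open.get((level, state)) != top_metric:
            continue                       # stale entry: a merging path was better
        if level == goal_level:            # Step 5: top path ends at the goal node
            return list(path), computations
        del best_open[(level, state)]      # Step 2: move the end node to the Closed Stack
        closed.add((level, state))
        for bit in ((0, 1) if level < info_len else (0,)):
            nxt, out = branch(state, bit)
            new_metric = top_metric + metric(out, llrs[level * n:(level + 1) * n])
            computations += 1
            node = (level + 1, nxt)
            if node in closed:             # Step 3: node already in the Closed Stack
                continue
            if node in best_open and best_open[node] <= new_metric:
                continue                   # Step 3: merging path with the higher metric
            best_open[node] = new_metric   # Step 4: insert in metric order
            heapq.heappush(open_stack, (new_metric, pushed, level + 1, nxt, path + out))
            pushed += 1
    raise ValueError("Open Stack exhausted before reaching the goal node")
```

A call such as `mlsda(llrs, 5)`, with `llrs` holding the 21 received log-likelihood ratios, returns the decoded codeword together with the count of branch metric computations, which is the complexity measure analysed in the next subsection.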
C. Analysis of the computational efforts of the MLSDA 

Since the nodes at levels L through (L + m − 1) have only one branch leaving them, and L is 
typically much larger than m, the contribution of these nodes to the computational complexity 
due to path extensions can be reasonably neglected. Hence, the analysis in the following theorem 
only considers those branch metric computations applied up to level L of the trellis. 

Notations that will be used in the next theorem are first introduced. Denote by $s_j(\ell)$ the 
node that is located at level $\ell$ and corresponds to state index $j$. Let $\mathcal{S}_j(\ell)$ be the set of paths 
that end at node $s_j(\ell)$. Also let $\mathcal{H}_j(\ell)$ be the set of the Hamming weights of the paths in 
$\mathcal{S}_j(\ell)$. Denote the minimum Hamming weight in $\mathcal{H}_j(\ell)$ by $d_j^*(\ell)$. As an example, $\mathcal{S}_3(3)$ equals 
{111010001, 000111010} in Fig. 5, which results in $\mathcal{H}_3(3) = \{5, 4\}$ and $d_3^*(3) = 4$. 
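The example above can be reproduced by enumerating the trellis paths directly. The sketch below is illustrative only. The generator taps are an assumption (the text does not state them; these taps reproduce the codeword in the caption of Fig. 5), the state numbering is an assumed convention, and the helper names `paths_ending_at` and `weights_and_minimum` are hypothetical. The enumeration is valid only for levels up to the information length, where both input bits are allowed.

```python
from itertools import product

# Assumed generator taps (1+D, 1+D^2, 1+D+D^2); not stated in the text.
GENERATORS = ((1, 1, 0), (1, 0, 1), (1, 1, 1))
MEMORY = 2


def paths_ending_at(state_index, level):
    """Labels of all paths from the start node to state `state_index` at `level`."""
    paths = set()
    for bits in product((0, 1), repeat=level):
        register = [0] * MEMORY            # previous inputs, newest first
        label = []
        for bit in bits:
            window = [bit] + register
            label += [sum(g * w for g, w in zip(gen, window)) % 2 for gen in GENERATORS]
            register = window[:MEMORY]
        # Assumed state numbering: the newest input bit is the least significant bit.
        if sum(b << i for i, b in enumerate(register)) == state_index:
            paths.add("".join(map(str, label)))
    return paths


def weights_and_minimum(paths):
    """Return the set of Hamming weights of the paths and its minimum."""
    weights = {p.count("1") for p in paths}
    return weights, min(weights)


paths = paths_ending_at(3, 3)
weights, d_star = weights_and_minimum(paths)
print(sorted(paths), sorted(weights), d_star)
```

State index 3 corresponds to both memory bits being 1, so the result does not depend on the assumed bit order of the state numbering in this particular case.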

Theorem 2 (Complexity of the MLSDA): Consider an (n, k, m) binary convolutional code 
transmitted via an AWGN channel. The average number of branch metric computations evaluated by 
the MLSDA, denoted by $L_{\mathrm{MLSDA}}(\gamma_b)$, is upper-bounded by 

$$L_{\mathrm{MLSDA}}(\gamma_b) \le 2^k \sum_{\ell=0}^{L-1} \sum_{j=0}^{2^m-1} B\left(d_j^*(\ell),\, N-\ell n,\, kL\gamma_b/N\right),$$

where if $\mathcal{H}_j(\ell)$ is empty, implying the non-existence of state $j$ at level $\ell$, then 
$B(d_j^*(\ell), N-\ell n, kL\gamma_b/N) = 0$. 

Proof: Assume without loss of generality that the all-zero codeword is transmitted. 

First, observe that for any two paths that end at a common node, only one of them will survive 
in the Open Stack. In other words, one of the two paths will be discarded either due to a larger 
path metric or because its end node has the same state and level as an entry in the Closed Stack. 
In the latter case, the surviving path has clearly reached the common end node earlier, and has 
already been extended by the MLSDA at some previous time (so that the state and level of its 
end node have already been stored in the Closed Stack). Accordingly, unlike the code tree search 
in the GDA, the branch metric computations that follow these two paths will only be performed 
once. It therefore suffices to derive the computational complexity of the MLSDA based on the 
nodes that have been extended rather than the paths that have been extended. 

Let $x^*$ label the minimum-metric code path for a given log-likelihood ratio $\phi$. Then we claim 
that if a node $s_j(\ell)$ is extended by the MLSDA, given that $x_{(\ell n-1)}$ is the only surviving path (in 
the Open Stack) that ends at this node at the time this node is extended, then 

$$\zeta\left(x_{(\ell n-1)} \,\big|\, \phi_{(\ell n-1)}\right) \le \zeta(x^* \,|\, \phi). \qquad (22)$$

The validity of the above claim can be simply proved by contradiction. Suppose 
$\zeta(x_{(\ell n-1)}|\phi_{(\ell n-1)}) > \zeta(x^*|\phi)$. Then the non-negativity of the individual metric $(y_j \oplus x_j)|\phi_j|$, which implies 
$\zeta(x^*|\phi) \ge \zeta(x^*_{(b)}|\phi_{(b)})$ for every $0 \le b \le N-1$, immediately gives 
$\zeta(x_{(\ell n-1)}|\phi_{(\ell n-1)}) > \zeta(x^*_{(b)}|\phi_{(b)})$ for every $0 \le b \le N-1$. 
Therefore, $x_{(\ell n-1)}$ cannot be on top of the Open Stack (because some 
$x^*_{(b)}$ always exists in the Open Stack), and hence violates the assumption that $s_j(\ell)$ is extended 
by the MLSDA. 

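For notational convenience, denote by $A(s_j(\ell), x_{(\ell n-1)})$ the event that "$x_{(\ell n-1)}$ is the only path 
in the intersection of $\mathcal{S}_j(\ell)$ and the Open Stack at the time node $s_j(\ell)$ is extended." Notably, the events 

$$\left\{A\left(s_j(\ell), x_{(\ell n-1)}\right)\right\}_{x_{(\ell n-1)} \in \mathcal{S}_j(\ell)}$$

are disjoint, and 

$$\sum_{x_{(\ell n-1)} \in \mathcal{S}_j(\ell)} \Pr\left\{A\left(s_j(\ell), x_{(\ell n-1)}\right)\right\} = 1.$$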

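Then according to the above claim, 

$$
\begin{aligned}
&\Pr\{\text{node } s_j(\ell) \text{ is extended by the MLSDA}\} \\
&\quad = \sum_{x_{(\ell n-1)} \in \mathcal{S}_j(\ell)} \Pr\left\{A\left(s_j(\ell), x_{(\ell n-1)}\right)\right\} \Pr\left\{\text{node } s_j(\ell) \text{ is extended by the MLSDA} \,\middle|\, A\left(s_j(\ell), x_{(\ell n-1)}\right)\right\} \\
&\quad \le \max_{x_{(\ell n-1)} \in \mathcal{S}_j(\ell)} \Pr\left\{\text{node } s_j(\ell) \text{ is extended by the MLSDA} \,\middle|\, A\left(s_j(\ell), x_{(\ell n-1)}\right)\right\} \qquad (23) \\
&\quad \le \max_{x_{(\ell n-1)} \in \mathcal{S}_j(\ell)} \Pr\left\{\zeta\left(x_{(\ell n-1)} \,\big|\, \phi_{(\ell n-1)}\right) \le \zeta(x^* \,|\, \phi)\right\} \\
&\quad \le \max_{x_{(\ell n-1)} \in \mathcal{S}_j(\ell)} \Pr\left\{\zeta\left(x_{(\ell n-1)} \,\big|\, \phi_{(\ell n-1)}\right) \le \zeta(0 \,|\, \phi)\right\} \\
&\quad = \max_{x_{(\ell n-1)} \in \mathcal{S}_j(\ell)} \Pr\left\{\sum_{j=0}^{\ell n-1} (y_j \oplus x_j)|\phi_j| \le \sum_{j=0}^{N-1} (y_j \oplus 0)|\phi_j|\right\},
\end{aligned}
$$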

where the replacement of $x^*$ by the all-zero codeword follows from $\zeta(x^*|\phi) \le \zeta(0|\phi)$. We 
then observe that for the AWGN channel defined through (20), $\phi_j = 4\sqrt{E}\,r_j/N_0$; hence, $y_j$ can 
be determined by 

$$y_j = \begin{cases} 1, & \text{if } r_j < 0; \\ 0, & \text{otherwise.} \end{cases}$$
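This observation, together with the fact that $2(y_j \oplus x_j)|r_j| = r_j[(-1)^{y_j} - (-1)^{x_j}]$, gives 

$$
\begin{aligned}
&\Pr\{\text{node } s_j(\ell) \text{ is extended by the MLSDA}\} \\
&\quad \le \max_{x_{(\ell n-1)} \in \mathcal{S}_j(\ell)} \Pr\left\{\sum_{j=0}^{\ell n-1} (y_j \oplus x_j)|r_j| \le \sum_{j=0}^{N-1} (y_j \oplus 0)|r_j|\right\} \\
&\quad = \max_{x_{(\ell n-1)} \in \mathcal{S}_j(\ell)} \Pr\left\{\sum_{j=0}^{\ell n-1} r_j\left[(-1)^{y_j} - (-1)^{x_j}\right] \le \sum_{j=0}^{N-1} r_j\left[(-1)^{y_j} - 1\right]\right\} \\
&\quad = \max_{x_{(\ell n-1)} \in \mathcal{S}_j(\ell)} \Pr\left\{\sum_{j \in \mathcal{J}(x_{(\ell n-1)})} r_j + \sum_{j=\ell n}^{N-1} \min(r_j, 0) \le 0\right\},
\end{aligned}
$$

where $\mathcal{J}(x_{(\ell n-1)})$ is the set of indices $j$, where $0 \le j \le \ell n-1$, for which $x_j = 1$. As $r_j$ is 
Gaussian distributed with mean $\sqrt{E}$ and variance $N_0/2$ due to the transmission of the all-zero 
codeword, Proposition 1 (in the Appendix) and Lemma 2 can be applied to obtain 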

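$$
\begin{aligned}
&\Pr\{\text{node } s_j(\ell) \text{ is extended by the MLSDA}\} \\
&\quad \le \max_{d \in \mathcal{H}_j(\ell)} \Pr\left\{r_1 + \cdots + r_d + \sum_{i=\ell n+1}^{N} \min(r_i, 0) \le 0\right\} \\
&\quad = \Pr\left\{r_1 + \cdots + r_{d_j^*(\ell)} + \sum_{i=\ell n+1}^{N} \min(r_i, 0) \le 0\right\} \\
&\quad \le B\left(d_j^*(\ell),\, N-\ell n,\, kL\gamma_b/N\right),
\end{aligned}
$$

where the i.i.d. samples have been relabelled as $r_1, \ldots, r_N$, so that the probability depends on a path only through its Hamming weight $d$, and the maximum is attained at the minimum weight $d_j^*(\ell)$ by Proposition 1. 

Consequently, 

$$L_{\mathrm{MLSDA}}(\gamma_b) \le 2^k \sum_{\ell=0}^{L-1} \sum_{j=0}^{2^m-1} \Pr\{\text{node } s_j(\ell) \text{ is extended by the MLSDA}\} \le 2^k \sum_{\ell=0}^{L-1} \sum_{j=0}^{2^m-1} B\left(d_j^*(\ell),\, N-\ell n,\, kL\gamma_b/N\right),$$

where the multiplication of $2^k$ is due to the fact that whenever a node is extended, $2^k$ branch 
metric computations will follow. ■ 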
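The algebraic step in the proof, which turns the metric comparison into a sum over the positions where the path has a 1 plus a sum of negative parts, can be checked numerically. The sketch below is illustrative only. The function names are hypothetical, and the Gaussian mean and variance are arbitrary assumed values rather than the ones used in the simulations.

```python
import random


def metric_gap(r, x_prefix):
    """Prefix metric of x minus the full-length metric of the all-zero codeword.

    Both metrics use |r_j| as weights, with y_j = 1 exactly when r_j < 0.
    """
    y = [1 if v < 0 else 0 for v in r]
    lhs = sum(abs(v) for v, yj, xj in zip(r, y, x_prefix) if yj != xj)
    rhs = sum(abs(v) for v, yj in zip(r, y) if yj == 1)
    return lhs - rhs


def reduced_form(r, x_prefix):
    """Sum of r_j over prefix positions with x_j = 1, plus sum of min(r_j, 0) beyond the prefix."""
    ones = sum(v for v, xj in zip(r, x_prefix) if xj == 1)
    tail = sum(min(v, 0.0) for v in r[len(x_prefix):])
    return ones + tail


def max_discrepancy(trials=1000, length=21, seed=0):
    """Largest absolute difference between the two forms over random draws."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        r = [rng.gauss(1.0, 1.0) for _ in range(length)]   # assumed mean and variance
        prefix_len = rng.randint(0, length)
        x_prefix = [rng.randint(0, 1) for _ in range(prefix_len)]
        worst = max(worst, abs(metric_gap(r, x_prefix) - reduced_form(r, x_prefix)))
    return worst


print(max_discrepancy())
```

The two quantities agree up to floating-point rounding, so the event that the metric gap is at most zero coincides with the event that the reduced form is at most zero.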

D. Numerical and simulation results 

The accuracy of the previously derived theoretical upper bound for the computational effort 
of the MLSDA is now empirically examined using two types of convolutional codes. One is 
a (2, 1,6) code with generators 634,564 (octal); the other is a (2, 1, 16) code with generators 
1632044, 1145734 (octal). The lengths of the applied information bits are 60 and 100. 
Fig. 6. Average computational complexity of the MLSDA for (2, 1, 6) convolutional code with 
generators 634, 564 (octal) and information length L = 100. 

Figures 6-9 present the deviation between the simulated results and the two theoretical upper 
bounds on the computational complexity of the MLSDA. According to these figures, the 
Berry-Esseen-enhanced theoretical upper bound is fairly close to the simulation results for both high $\gamma_b$ 
(above 6 dB) and low $\gamma_b$ (below 2 dB). Even for moderate $\gamma_b$, they differ by no more than 
0.8 on a $\log_{10}$ scale in Figs. 6-9. The differences between the two theoretical upper bounds 
with and without Berry-Esseen analysis are now visible in these figures. For example, the ratios 
of the two theoretical bounds are respectively 0.86, 0.90 and 0.95 at 4.0 dB, 4.5 dB and 5.0 dB 
in Fig. E 

A side observation from these figures is that codes with longer constraint length, although 
having a lower bit error rate, require more computations. However, this tradeoff between constraint 
length and bit error rate is moderately eased at high SNR. Notably, when $\gamma_b > 6$ dB, the 
average computational effort of the MLSDA in all four figures is reduced to approximately $2^k L$ 
regardless of the constraint length. 

V. Conclusions 

Using the large deviations technique and the Berry-Esseen theorem, this study established 
theoretical upper bounds on the computational effort of the simplified GDA and the MLSDA 
for AWGN channels. 

Fig. 7. Average computational complexity of the MLSDA for (2, 1, 6) convolutional code with 
generators 634, 564 (octal) and information length L = 60. 

Fig. 8. Average computational complexity of the MLSDA for (2, 1, 16) convolutional code with 
generators 1632044, 1145734 (octal) and information length L = 100. 

Two factors may determine the accuracy of the complexity upper bound. The first 
is the accuracy of the large deviations probability bound for the sum of independent samples 
in Lemma 2, and the second is the accuracy of the estimate of the node extension probability 
for sequential-type decoding. We found, however, that the main inaccuracy may not come from 
the latter. Taking the GDA as an example, (16) is actually the exact event for path $x_{(\ell)}$ 
to be extended by the simplified GDA, and (17) becomes an equality when the maximum-likelihood 
decision is exactly the transmitted all-zero codeword. Notably, as long as the node expanding 
distribution for each node is known, the average decoding complexity can be exactly obtained 
(specifically, if $Z_j = 1$ when node $j$ is visited and expanded, and $Z_j = 0$ otherwise, then the 
average number of computations is exactly $2\sum_j E[Z_j] = 2\sum_j \Pr[Z_j = 1]$, since the extension 
of each path causes two branch metric computations). Hence, the main inaccuracy is due to the 
overestimate of the large deviations probability bound for the sum of independent variables (and, of 
course, accumulating such overestimates by summing over all nodes may make the situation worse). 
Since the codes simulated for the GDA are of lengths 24 and 48, under which the large 
deviations probability bound is very inaccurate, the resultant complexity bound is also inaccurate, 
and the Berry-Esseen inequality does not provide much help in decreasing such inaccuracy. As for 
the MLSDA, a looser estimate is used to bound the node expanding probability by 
replacing "summation" by "maximization" as shown in (23). However, the resultant complexity 
bound is much more accurate simply because the large deviations probability bound is more 
exact at larger block length. 
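The indicator-variable argument in the conclusions (the average number of computations equals twice the sum of the node extension probabilities) can be illustrated with a short simulation. This is a sketch under stated assumptions: the extension probabilities below are made-up values, the function names are hypothetical, and the nodes are simulated as independent only for convenience, since linearity of expectation does not require independence.

```python
import random


def expected_computations(extension_probs, branches_per_node=2):
    """Exact average: branches_per_node times the sum of Pr[Z_j = 1]."""
    return branches_per_node * sum(extension_probs)


def simulated_computations(extension_probs, trials=20000, seed=0, branches_per_node=2):
    """Monte Carlo average of branches_per_node * sum_j Z_j with Bernoulli Z_j."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += sum(branches_per_node for p in extension_probs if rng.random() < p)
    return total / trials


probs = [1.0, 0.5, 0.25, 0.1]   # hypothetical node extension probabilities
print(expected_computations(probs), simulated_computations(probs))
```

Replacing each exact probability by an upper bound, as done in the theorems, turns the exact average into an upper bound, and any looseness in the per-node bounds accumulates in the sum.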
Appendix 

Proposition 1: For a fixed non-negative integer $k$, the probability 

$$\Pr\{r_1 + \cdots + r_d + \min(w_1, 0) + \cdots + \min(w_k, 0) \le 0\}$$

is a decreasing function of the non-negative integer $d$, provided that $r_1, r_2, \ldots, r_d, w_1, w_2, \ldots, w_k$ 
are i.i.d. with a Gaussian marginal distribution of positive mean $\mu$ and variance $\sigma^2$. 

Proof: Assume without loss of generality that $\sigma^2 = 1$. Also, assume $k \ge 1$ since the 
proposition is trivially valid for $k = 0$. 

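Let $\Omega_d = r_1 + \cdots + r_d$ and $\hat{w}_i = \min(w_i, 0)$. Denote the probability density function of $w_1$ by $f(\cdot)$. Then putting 
$\nu = \Pr\{\hat{w}_1 = 0\}$ yields 

$$
\begin{aligned}
&\Pr\{\Omega_d + \hat{w}_1 + \hat{w}_2 + \cdots + \hat{w}_k \le 0\} \\
&\quad = \sum_{j=0}^{k} \Pr\{\text{exactly } (k-j) \text{ zeros in } (\hat{w}_1, \hat{w}_2, \ldots, \hat{w}_k)\} \\
&\qquad\quad \times \Pr\{\Omega_d + \hat{w}_1 + \hat{w}_2 + \cdots + \hat{w}_k \le 0 \mid \text{exactly } (k-j) \text{ zeros in } (\hat{w}_1, \hat{w}_2, \ldots, \hat{w}_k)\} \\
&\quad = \binom{k}{0} \nu^k \Pr\{\Omega_d \le 0\} + \binom{k}{1} \nu^{k-1}(1-\nu) \int_{-\infty}^{0} f(x) \Pr\{\Omega_d \le -x\}\,dx \\
&\qquad + \binom{k}{2} \nu^{k-2}(1-\nu)^2 \int_{-\infty}^{0}\!\int_{-\infty}^{0} f(x_1) f(x_2) \Pr\{\Omega_d \le -(x_1 + x_2)\}\,dx_1\,dx_2 \\
&\qquad + \cdots \\
&\qquad + \binom{k}{k} (1-\nu)^k \int_{-\infty}^{0}\!\cdots\!\int_{-\infty}^{0} f(x_1) \cdots f(x_k) \Pr\{\Omega_d \le -(x_1 + \cdots + x_k)\}\,dx_1 \cdots dx_k.
\end{aligned}
$$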

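Accordingly, if each of the above $(k + 1)$ terms is non-increasing in $d$, so is their sum. Let 

$$
\begin{aligned}
q_j(d) &= \int_{-\infty}^{0}\!\cdots\!\int_{-\infty}^{0} f(x_1) \cdots f(x_j) \Pr\{\Omega_d \le -(x_1 + \cdots + x_j)\}\,dx_1 \cdots dx_j \\
&= \int_{-\infty}^{0}\!\cdots\!\int_{-\infty}^{0} f(x_1) \cdots f(x_j)\, \Phi\left(-\frac{x_1 + \cdots + x_j}{\sqrt{d}} - \sqrt{d}\,\mu\right) dx_1 \cdots dx_j.
\end{aligned}
$$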



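Then 

$$
\begin{aligned}
\frac{\partial q_j(d)}{\partial d} &= \int_{-\infty}^{0}\!\cdots\!\int_{-\infty}^{0} f(x_1) \cdots f(x_j) \left(\frac{x_1 + \cdots + x_j}{d} - \mu\right) \frac{1}{\sqrt{8\pi d}}\, e^{-(x_1 + \cdots + x_j + d\mu)^2/(2d)}\,dx_1 \cdots dx_j \\
&\le -\frac{\mu}{\sqrt{8\pi d}} \int_{-\infty}^{0}\!\cdots\!\int_{-\infty}^{0} f(x_1) \cdots f(x_j)\, e^{-(x_1 + \cdots + x_j + d\mu)^2/(2d)}\,dx_1 \cdots dx_j \qquad (24) \\
&< 0,
\end{aligned}
$$

where (24) follows from $x_i \le 0$ (according to the range of integration) for $1 \le i \le j$. 
Consequently, $q_j(d)$ is decreasing in $d$ for $d$ positive and every $1 \le j \le k$. The proof is 
completed by noting that the first term, $\Pr\{\Omega_d \le 0\} = \Phi(-\sqrt{d}\,\mu)$, is also decreasing in $d$. ■ 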
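Proposition 1 can also be illustrated by simulation. The sketch below is a Monte Carlo estimate, not a proof. The function name `tail_probability`, the parameter values (mean 1, variance 1, k = 3) and the number of trials are assumptions made for illustration only.

```python
import random


def tail_probability(d, k, mu=1.0, sigma=1.0, trials=20000, seed=0):
    """Estimate Pr{r_1 + ... + r_d + min(w_1, 0) + ... + min(w_k, 0) <= 0}.

    All r_i and w_i are drawn i.i.d. from a Gaussian with mean mu and
    standard deviation sigma.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        total = sum(rng.gauss(mu, sigma) for _ in range(d))
        total += sum(min(rng.gauss(mu, sigma), 0.0) for _ in range(k))
        if total <= 0:
            hits += 1
    return hits / trials


estimates = [tail_probability(d, k=3) for d in range(5)]
print(estimates)
```

For d = 0 the sum consists of non-positive terms only, so the estimate equals 1; for larger d the estimates fall, in line with the monotonicity that the proposition asserts.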